Generalized likelihood ratio test for voiced/unvoiced decision using the harmonic plus noise model
Abstract
In this paper, a novel method for voiced/unvoiced decision in speech and music signals is presented. A voiced/unvoiced decision is required for many applications, including better modeling for analysis/synthesis, detection of model changes for segmentation purposes, and better signal characterization for indexing and recognition applications. The proposed method is based on the Generalized Likelihood Ratio Test (GLRT) and assumes colored Gaussian noise with unknown covariance. Under the voiced hypothesis, a harmonic plus noise model is assumed. The derived method is combined with a Maximum A-posteriori Probability (MAP) scheme to obtain a voiced/unvoiced tracking algorithm. The performance of the proposed method is tested on the Keele University database for different signal-to-noise ratios (SNRs), and the results show that the algorithm performs well even under severe noise conditions.
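The abstract does not give the test statistic itself, so the following Python sketch only illustrates the general idea of a GLRT with a harmonic model. It makes a deliberate simplification: the noise is assumed white Gaussian with unknown variance, not colored with unknown covariance as in the paper. The function names `harmonic_basis` and `glrt_harmonic`, the number of harmonics, and the candidate pitch grid are all assumptions made for illustration.

```python
import numpy as np


def harmonic_basis(n_samples, f0, fs, n_harmonics):
    """Cosine/sine basis for the first n_harmonics multiples of f0 (hypothetical helper)."""
    t = np.arange(n_samples) / fs
    k = np.arange(1, n_harmonics + 1)
    phase = 2.0 * np.pi * f0 * np.outer(t, k)
    return np.hstack([np.cos(phase), np.sin(phase)])


def glrt_harmonic(frame, fs, f0_grid, n_harmonics=5):
    """Log-GLRT statistic for 'harmonics + white noise' vs. 'white noise only'.

    Simplifying assumption (not the paper's model): white Gaussian noise
    with unknown variance under both hypotheses.
    Returns (statistic, f0 candidate that maximizes it).
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    n = len(frame)
    total_energy = max(float(frame @ frame), 1e-12)

    best_stat, best_f0 = -np.inf, None
    for f0 in f0_grid:
        basis = harmonic_basis(n, f0, fs, n_harmonics)
        # Maximum-likelihood amplitudes are the least-squares solution.
        amps, *_ = np.linalg.lstsq(basis, frame, rcond=None)
        residual = frame - basis @ amps
        residual_energy = max(float(residual @ residual), 1e-12)
        # With ML variance estimates plugged in, the log likelihood ratio
        # reduces to (N/2) * log(total energy / residual energy).
        stat = 0.5 * n * np.log(total_energy / residual_energy)
        if stat > best_stat:
            best_stat, best_f0 = stat, float(f0)
    return best_stat, best_f0
```

A frame would be declared voiced when the returned statistic exceeds a threshold; that threshold is not specified in the abstract and would need to be tuned for a target false-alarm rate. Handling colored noise, as the paper does, would require replacing the energy ratio with a statistic that accounts for the unknown covariance.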
Similar resources
Uniform concatenative excitation model for synthesising speech without voiced/unvoiced classification
In general, speech synthesis using the source-filter model of speech production requires the classification of speech into two classes (voiced and unvoiced), a step that is prone to errors. For voiced speech, the input of the synthesis filter is an approximately periodic excitation, whereas for unvoiced speech it is a noise signal. This paper proposes an excitation model which can be used to synthesise both ...
Approximate Kalman Filtering for the Harmonic plus Noise Model
We present a probabilistic description of the Harmonic plus Noise Model (HNM) for speech signals. This probabilistic formulation permits Maximum Likelihood (ML) parameter estimation and speech synthesis becomes a straightforward sampling from a distribution. It also permits development of a Kalman filter that tracks model parameters such as pitch, harmonic amplitudes, and autoregressive coeffic...
A Statistical Model-Based V/UV Decision under Background Noise Environments
In this letter, we propose an approach to incorporate a statistical model for the voiced/unvoiced (V/UV) speech decision under background noise environments. Our approach consists of splitting the input noisy speech into two separate bands and applying a statistical model for each band. We compute and compare the likelihood ratio (LR) for each band based on the statistical model and estimated n...
Improvements in Speaker Characterization Using Spectral Subband Energy Based on Harmonic plus Noise Model
We previously proposed the use of Spectral Subband Energy Ratio (SSER) as speaker features in a speaker verification system [1]. Those SSER features were derived from two distinct components, the harmonic and noise parts of speech, which were decomposed from the original speech by the Harmonic plus Noise Model (HNM). In this paper, we report several recent improvements to this approach. First, we go ...
Robust automatic continuous-speech recognition based on a voiced-unvoiced decision
In this paper, the implementation of a robust front-end to be used for a large-vocabulary Continuous Speech Recognition (CSR) system based on a Voiced-Unvoiced (V-U) decision has been addressed. Our approach is based on the separation of the speech signal into voiced and unvoiced components. Consequently, speech enhancement can be achieved through processing of the voiced and the unvoiced compo...
Publication date: 2003